Ordered partitioning reveals extended splice-site consensus information.

Authors

  • Michael Weir
  • Michael Rice
Abstract

Using recently available cDNA and genomic data (Berkeley Drosophila Genome Project; http://www.fruitfly.org), we computed a large sample of 10,057 Drosophila splice sites. An information-theoretic analysis of the nucleotide sequences adjacent to these splice sites showed a strong correlation between the sizes of introns and exons and the levels of information, which is a measure of sequence conservation. The strong correlation permitted us to determine extensive consensus sequences at the donor and acceptor sites of longer introns. These sequences were further refined and extended by examining the information in regions around splice sites that only partially matched the consensus. The correlation between length and information provided the basis for determining alternative consensus arrangements associated with shorter introns, as well as general base-composition preferences that likely promote spliceosome function. We also observed a correlation between information near splice sites and the lengths of nonadjacent introns, indicating that there are long-range effects spanning multiple introns. The ordered partitioning approach used in this analysis may become increasingly useful as large genomic data sets become available.
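The abstract does not spell out the computation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "information" at an aligned position is the usual Shannon measure for DNA (2 bits minus the positional entropy, with no small-sample correction), and that "ordered partitioning" means sorting splice-site records by intron length and cutting the sorted list into contiguous groups. The function names, the `(length, sequence)` record layout, and the `n_bins` parameter are hypothetical.

```python
import math
from collections import Counter


def position_information(seqs):
    """Information in bits at each aligned position: 2 - Shannon entropy.

    Assumes all sequences have the same length and use the alphabet A/C/G/T.
    No small-sample correction is applied (a simplifying assumption).
    """
    if not seqs:
        return []
    info = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(2.0 - entropy)
    return info


def ordered_partitions(records, n_bins):
    """Sort (length, sequence) records by length, then split the sorted
    list into at most n_bins contiguous groups of near-equal size."""
    ordered = sorted(records, key=lambda r: r[0])
    if not ordered:
        return []
    size = math.ceil(len(ordered) / n_bins)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def information_by_partition(records, n_bins):
    """For each length-ordered group, return (min length, max length,
    total information summed over all aligned positions)."""
    result = []
    for group in ordered_partitions(records, n_bins):
        lengths = [length for length, _ in group]
        total_info = sum(position_information([seq for _, seq in group]))
        result.append((min(lengths), max(lengths), total_info))
    return result
```

Calling `information_by_partition(records, n_bins)` on a list of `(intron_length, flanking_sequence)` pairs gives one total-information value per length range, which is the kind of quantity one would compare across groups to look for a length-information correlation like the one the abstract reports.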

Related resources

Expression analysis of an FGFR2 IIIc 5' splice site mutation (1084+3A->G).

Sequence variations within splice sites may pose problems in the interpretation of their pathogenic effect, especially when these variations occur outside the highly conserved /gt (donor or 5' site) and ag/ (acceptor or 3' site) consensus dinucleotides that immediately flank most exons. A commonly used method to evaluate the probable effect of a sequence variation on splicing is to calculate t...

A sequence compilation and comparison of exons that are alternatively spliced in neurons.

Alternative splicing is an important regulatory mechanism to create protein diversity. In order to elucidate possible regulatory elements common to neuron specific exons, we created and statistically analysed a database of exons that are alternatively spliced in neurons. The splice site comparison of alternatively and constitutively spliced exons reveals that some, but not all alternatively spl...

The Importance of Window Length in Splice Site Prediction

The performance of gene prediction programs strongly depends on the methods that they use to locate splice sites. Different pattern recognition techniques are available to assess the quality of candidate splice sites, see [1] for an overview and further references. All of these techniques proceed by computing a score derived from the distribution of the nucleotides in the neighbourhood of a spl...

Extended and infinite ordered weighted averaging and sum operators with numerical examples

This study discusses some variants of Ordered Weighted Averaging (OWA) operators and related information aggregation methods. In detail, we define the Extended Ordered Weighted Sum (EOWS) operator and the Extended Ordered Weighted Averaging (EOWA) operator, which are applied in scientometrics evaluation where the preference is over finitely many representative works. As...

Identification of two different mutations in the PDS gene in an inbred family with Pendred syndrome

Recently the gene responsible for Pendred syndrome (PDS) was isolated and several mutations in the PDS gene have been identified in Pendred patients. Here we report the occurrence of two different PDS mutations in an extended inbred Turkish family. The majority of patients in this family are homozygous for a splice site mutation (1143-2A→G) affecting the 3' splice site consensus sequence of intro...

Journal:
  • Genome Research

Volume 14, Issue 1

Publication date: 2004